Abstract: We study the problem of optimizing a graph-structured
objective function under adversarial uncertainty. This 
problem can be modeled as a two-person zero-sum game 
between an Engineer and Nature. The Engineer controls a subset 
of the variables (nodes in the graph), and tries to assign their 
values to maximize an objective function. Nature controls the 
complementary subset of variables and tries to minimize the same 
objective. This setting encompasses estimation and optimization 
problems under model uncertainty, and strategic problems with 
a graph structure. Von Neumann's minimax theorem guarantees 
the existence of a (minimax) pair of randomized strategies that 
provide optimal robustness for each player against its adversary. 

We prove several structural properties of this strategy pair 
in the case of a graph-structured payoff function. In particular, 
the randomized minimax strategies (distributions over variable 
assignments) can be chosen so as to satisfy the Markov 
property with respect to the graph. This significantly reduces 
the problem dimensionality. Finally, we introduce a message 
passing algorithm to solve this minimax problem. The algorithm 
generalizes max-product belief propagation to this new domain. 

I. Introduction 

A two-person zero-sum game in normal form is specified 
by an objective (or utility) function $O : (x, \theta) \mapsto O(x, \theta)$, 
whereby $x \in \mathcal{X}$ is the strategy of the first player (which 
we shall call by convention the Engineer), while $\theta \in \Theta$ is 
the strategy of the second player (Nature). Once the strategy 
pair $(x, \theta)$ is chosen, the Engineer earns from Nature an 
amount $O(x, \theta)$. The two players optimize their strategies with 
respect to the opposite objectives of maximizing (Engineer) or 
minimizing (Nature) this function. Zero-sum games capture 
strategic situations in which agents compete for a fixed, limited 
pool of resources [?]. Remarkably, they have found 
broad applicability beyond economic theory, including areas 
such as online prediction and learning [6], and statistical 
decision theory [2]. Here statistical estimation is viewed as 
a game between a Statistician (who tries to design the best 
statistical procedure) and Nature (who chooses the worst 
parameters). 

Closer to our motivation, a large variety of optimal design 
problems in engineering can be reduced to maximizing an 
appropriate objective function. The form of this function 
is normally dictated by a model of the underlying system, 
with parameters to be estimated empirically. Of course the 
parameter estimation process is inherently imprecise and, 
more importantly, any model of a real system necessarily 
overlooks a multitude of effects. This remark has motivated the 
burgeoning fields of robust optimization and robust control [?]. 
In this context, one considers a family of objective functions 
$x \mapsto O(x; \theta)$, with $x$ the design variables, and $\theta \in \Theta$ a vector 
of parameters. Rather than designing for a 'nominal' parameter value in $\Theta$, 
one then tries to maximize the worst-case objective $\min_{\theta \in \Theta} O(x; \theta)$. 
The problem is hence reduced to a two-player zero-sum game. 

Robust optimization theory provides a wealth of structural 
information, and efficient algorithms for classes of objective 
functions $O(\,\cdot\,, \theta) : x \mapsto O(x, \theta)$ that are convex in the control 
variables $x$. The present paper takes a complementary point 
of view. We assume that both $x$ and $\theta$ take values in high-dimensional,
discrete spaces. Explicitly, $x \in \mathcal{X}^V$ and 
$\theta \in \Theta^F$ where $\mathcal{X}$, $\Theta$ are finite alphabets and $V$, $F$ are 
finite index sets. Letting $|V| = n$ and $|F| = m$, a pair of pure 
strategies is specified by two vectors: the Engineer controls 
variables $x = (x_1, x_2, \dots, x_n)$ indexed by the elements of $V$, 
while Nature controls parameters $\theta = (\theta_1, \theta_2, \dots, \theta_m)$. Within 
this setting we aim at finding strategies for the Engineer which 
are optimally robust with respect to Nature. 

Of course, this general problem is NP-hard, and indeed so 
even in the absence of any adversary (since it includes MaxSat 
as a special case). Our approach is to exploit simplifications 
that follow from the underlying factorization structure of the 
objective function. More precisely, we shall assume that the 
objective $O(x, \theta)$ can be expressed as a sum of terms which 
are local on a graph $G = (V, F, E)$, whereby $V$ is the set of 
nodes controlled by the Engineer, $F$ the set of nodes controlled 
by Nature, and $E$ the edge set. 

Graph-structured objective functions naturally arise from 
probabilistic graphical models [15]. In particular, if $\mu(x)$ 
is the probability of configuration $x$ under a probabilistic 
graphical model, then $\log \mu(x)$ is an objective function that 
factors additively. Hence MAP estimation falls in the class of 
optimization problems considered here. With a slight abuse of 
terminology, we shall use the term 'graphical model' to refer 
to general graph-structured objective functions, even if these 
do not originate from probability distributions. 

The application to graphical models also clarifies the need 
for robustness. Graphical models are particularly effective at 
expressing complex relationships. Consider for instance the 
subtle relationships between diseases and symptoms in a medical
diagnostic system [16]. Such relationships are normally 
modeled through simple parametric families of conditional 
probabilities (e.g. logit or noisy OR). However, it is not 
expected that these parametric expressions coincide with the 
'true' conditional distributions. The only solid justification for 
this methodology is that the resulting predictions are robust 
with respect to the details of the model itself. Robustness 
is therefore implicitly assumed, but has never been carefully 
investigated and accounted for (but see Section IV for related 
work). 

Fig. 1. Engineer's objective vs. $\Delta$ for the Ising model example. 

Apart from the use of graphical models, we achieve significant
structural simplification by convexifying the space of 
strategies, i.e. introducing randomized (mixed) strategies. This 
is a well established path within game theory. The Engineer 
has at her disposal a stochastic device generating strategy 
$x \in \mathcal{X}^V$ with probability $p(x)$, and plays it, while Nature plays 
strategy $\theta \in \Theta^F$ with probability $q(\theta)$. The Engineer tries to 
maximize the expected utility $\mathbb{E}_{(p,q)}\{O(x, \theta)\}$, while Nature 
tries to minimize the same quantity. A crucial consequence is 
that the problems faced by the two players become dual linear 
programs (LPs). In particular, von Neumann's celebrated 
minimax theorem ensures the existence of a saddle point, i.e. 
a strategy pair $(p^*, q^*)$ such that for any other strategies $p$, $q$ 

$$\mathbb{E}_{(p,q^*)}\{O(x,\theta)\} \le \mathbb{E}_{(p^*,q^*)}\{O(x,\theta)\} \le \mathbb{E}_{(p^*,q)}\{O(x,\theta)\}. \tag{1}$$

This is equivalent to requiring that $(p^*, q^*)$ forms 
a Nash equilibrium. The saddle point condition implies 
in particular that the order of play does not matter: 
$\max_p \min_q \mathbb{E}_{(p,q)}\{O(x,\theta)\} = \min_q \max_p \mathbb{E}_{(p,q)}\{O(x,\theta)\}$. In 
words, $p^*$ provides to the Engineer optimal robustness against 
Nature's adversarial choice, and indeed the same as if this 
choice were known in advance. Remarkably, the worst-case 
expected utility of strategy $p^*$ is in general strictly larger than 
the worst-case utility of any pure strategy. 
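The last remark can be checked on a toy game. The sketch below is not taken from the paper: it uses matching pennies as a hypothetical payoff matrix, evaluates the worst case of each strategy over Nature's pure replies (which suffices because the expected payoff is linear in Nature's mixed strategy), and searches a grid of mixed strategies for the Engineer.

```python
from fractions import Fraction

# Hypothetical payoff to the Engineer (matching pennies):
# rows are the Engineer's pure strategies, columns are Nature's.
PAYOFF = [[1, -1],
          [-1, 1]]

def worst_case(p):
    """Worst-case expected payoff of mixed strategy p over Nature's pure replies."""
    n_rows, n_cols = len(PAYOFF), len(PAYOFF[0])
    return min(sum(p[i] * PAYOFF[i][j] for i in range(n_rows))
               for j in range(n_cols))

# Worst-case value of each pure strategy of the Engineer.
pure_values = [worst_case([1, 0]), worst_case([0, 1])]

# Grid search over mixed strategies (a, 1 - a) with exact arithmetic.
grid = [Fraction(k, 100) for k in range(101)]
best_a = max(grid, key=lambda a: worst_case([a, 1 - a]))
best_value = worst_case([best_a, 1 - best_a])
```

In this toy game every pure strategy has worst-case payoff -1, while the uniform mixed strategy guarantees 0.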

Notice that convexification of the strategy space is 
achieved at the expense of an exponential blow-up in dimensionality.
While a pure strategy for the Engineer is a (discrete) 
vector of length $n$, a mixed strategy is a (probability) vector of 
length $|\mathcal{X}|^n$. Hence, by itself, convexification does not reduce 
the problem complexity. 

In the next section we will illustrate key ideas and questions 
on a simple example, and then describe our general formalism 
and contributions. In Section III we derive a message passing 
algorithm, called Robust Max-Product, to construct minimax
strategies. Finally, we review related work in Section IV. 

Fig. 2. Ising model example; the solid curves indicate the Engineer's expected 
payoff and the scattered points show the payoff for random instantiations of 
mixed strategies. For each value of $\alpha$, there are 50 random instantiations of 
mixed strategies. 



II. An example and main contributions 

Ising models. The Ising model is a pairwise graphical model 
with binary alphabet $\mathcal{X} = \{-1, +1\}$. The unnormalized log 
probability $O(x, \theta) = \log \mu(x) + \text{const}$ can be written as 

$$O(x, \theta) = \sum_{(i,j)} \psi_{ij}(x_i, x_j, \theta_{ij}) + \sum_{i} \psi_i(x_i, \theta_i), \qquad \psi_{ij}(x_i, x_j, \theta_{ij}) = \theta_{ij} x_i x_j, \quad \psi_i(x_i, \theta_i) = \theta_i x_i, \tag{2}$$

where the first sum runs over the edges of the graph and the second over its nodes. 

In practice, the parameters $\theta$ are learned from the data and 
hence are inaccurate. The Engineer's challenge is to find a 
strategy $p^*(x)$ which maximizes $\mathbb{E}_{(p, q^*)} O(x, \theta)$ for the 
worst case distribution $q^*$ of the uncertain parameter $\theta$. 

Minimax strategies depend on the domain of $\theta$. We consider
a family of models with parameters $h, \Delta \ge 0$. 
Single variable potentials have $\theta_i \sim U[-h, h]$, i.e., $\theta_i$ is 
uniformly distributed between $-h$ and $h$, and edges belong 
to two classes: positive and negative. Positive edges have 
$\theta_{ij} \in \{+1 - \Delta, +1, +1 + \Delta\}$ while negative edges have 
$\theta_{ij} \in \{-1 - \Delta, -1, -1 + \Delta\}$. 

Finding minimax strategies for this model is NP-hard on general graphs, even 
in the special case $\Delta = 0$, $h = 0$. Indeed, when all edges 
are negative, this reduces to the MAXCUT problem. 
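To make the setup concrete, the following sketch evaluates the Ising objective on a tiny instance and computes, by brute force, the worst case of a pure assignment over Nature's edge perturbations. The 3-node path, the zero single-variable parameters, and the value of `DELTA` are illustrative assumptions and are not the 93-node tree used in the experiments.

```python
from itertools import product

# Hypothetical 3-node path: edge (0, 1) is "positive", edge (1, 2) is "negative".
EDGES = {(0, 1): +1.0, (1, 2): -1.0}   # nominal couplings theta_ij
FIELDS = [0.0, 0.0, 0.0]               # single-variable parameters theta_i (assumed fixed)
DELTA = 0.5                            # size of Nature's perturbation (assumed)

def objective(x, theta_edges, theta_nodes):
    """Ising objective: sum of theta_ij * x_i * x_j plus sum of theta_i * x_i."""
    pair = sum(theta_edges[(i, j)] * x[i] * x[j] for (i, j) in theta_edges)
    single = sum(t * xi for t, xi in zip(theta_nodes, x))
    return pair + single

def worst_case(x):
    """Minimum of the objective over Nature's pure choices on every edge."""
    edges = list(EDGES)
    choices = [[EDGES[e] - DELTA, EDGES[e], EDGES[e] + DELTA] for e in edges]
    return min(objective(x, dict(zip(edges, c)), FIELDS)
               for c in product(*choices))

# Best pure assignment for the Engineer against a pure adversary.
best_x = max(product([-1, +1], repeat=3), key=worst_case)
```

Here the assignment (+1, +1, -1) scores 2 at nominal values but only 1 in the worst case; brute force of this kind is exponential in the number of nodes and edges, which is what the structural results of the paper are meant to avoid.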

We consider a random tree with n = 93 nodes as the 
underlying graph and perform the following experiments. 

First Experiment: We apply the Robust Max-Product 
algorithm derived in Section III to find the Engineer's optimum 
strategy, $p^*$. We further compare it with the case in which the Engineer 
ignores Nature and applies the classical Max-Product to the 
graph with nominal values, namely $\theta_{ij} = +1$ on positive 
edges and $-1$ on negative edges. Finally, in order to check 
the convergence of Robust Max-Product, we compute 
$p^*$ by solving the minimax optimization problem in CVX [12]. 
Note that the latter is computationally much more expensive 
than Robust Max-Product and Max-Product. 
Figure 1 summarizes the results for different values of $\Delta$. 
Robust Max-Product was run for 100 iterations. For 
$\Delta = 0$, Max-Product and Robust Max-Product are 
equivalent as Nature has no power. As $\Delta$ increases, Robust 
Max-Product performs increasingly better compared to 
Max-Product. 

Second Experiment: Let $p^*$ be the Engineer's strategy given 
by the Robust Max-Product algorithm, and let $\hat{p}^*$ be the 
one obtained by applying Max-Product considering the 
nominal values of the parameters. Finally, denote by $q^*$ and $\hat{q}^*$ 
Nature's best responses against $p^*$ and $\hat{p}^*$. In this experiment, 
we compare the performance of strategies $p^*$ and $\hat{p}^*$ when 
Nature deviates from her optimal strategies $q^*$, $\hat{q}^*$. More 
specifically, Nature's strategy is chosen to be a mixture of her 
optimum strategy and the uniform distribution, i.e., 
$q(\theta) = (1 - \alpha) q^*(\theta) + \alpha |\Theta|^{-m}$ and $\hat{q}(\theta) = (1 - \alpha) \hat{q}^*(\theta) + \alpha |\Theta|^{-m}$. 
Figure 2 illustrates the Engineer's payoff for the pairs of 
strategies $(p^*, q(\theta))$ and $(\hat{p}^*, \hat{q}(\theta))$, as $\alpha$ varies. Here, $\alpha = 0$ 
corresponds to the case where Nature chooses her optimum strategy.
Robust Max-Product outperforms the simple Max-Product
in this regime. As $\alpha$ increases, Nature changes 
from being adversarial to completely random for $\alpha = 1$. Max-Product
outperforms Robust Max-Product in the latter 
case, since Nature is no longer an adversary and one can 
design for nominal values. 

A. Main contributions 

We consider a general bipartite graph (or factor graph) $G = (V, F, E)$,
where nodes in $V$ (variable nodes, to be denoted by 
$i, j, k, \dots$) are controlled by the Engineer, nodes in $F$ (factor 
nodes, denoted by $a, b, c, \dots$) are controlled by Nature, and 
$E \subseteq V \times F$ is a set of undirected edges. Given $i \in V$, the 
set of its neighbors is denoted by $\partial i = \{a \in F : (i, a) \in E\}$.
The neighborhood of $a \in F$, denoted by $\partial a$, is defined 
analogously. 

The objective function $O : \mathcal{X}^V \times \Theta^F \to \mathbb{R}$ factors on 
graph $G$ if$^1$ there exists a set of functions $\psi = \{\psi_a : a \in F\}$, 
$\psi_a : \mathcal{X}^{\partial a} \times \Theta \to \mathbb{R}$, such that 

$$O(x, \theta) = \sum_{a \in F} \psi_a(x_{\partial a}, \theta_a).$$

The functions $\psi_a$ are called potentials. There is no loss of 
generality in assuming $G$ to be bipartite. If two nodes $i$, 
$j$ controlled by the same player were neighbors, we could 
replace them by a single node with strategy space $\mathcal{X}' = \mathcal{X} \times \mathcal{X}$. 

As discussed above, our goal is to find a pair $(p^*, q^*)$ where 
$p^*$ is a probability distribution over $\mathcal{X}^V$, $q^*$ a distribution over 
$\Theta^F$, and the pair satisfies the Nash equilibrium condition (1). 
From the point of view of the Engineer, this amounts to solving 
the problem 

$$p^* = \arg\max_p \min_q \mathbb{E}_{(p,q)} O(x, \theta). \tag{3}$$

The support $\mathrm{supp}(p)$ of a probability distribution $p$ is the 
smallest set $S$ such that $p(S^c) = 0$. 

$^1$ While slightly more general definitions (symmetric in $x$ and $\theta$) are 
possible, we stick to the present one because it is already rich enough to 
discuss all the key challenges. 



Since $\mathbb{E}_{(p,q)} O(x, \theta)$ is linear both in $p$ and $q$, which belong 
to the simplex, (3) is equivalent to an LP problem. However, 
the dimensionality of this problem is exponential in the graph 
size: even writing down the strategy takes exponential time. 
The following result plays a key role in our approach. 

Theorem II.1. In problem (3), without loss of generality, we 
can assume that the Engineer's strategy is a Markov Random 
Field (MRF) with factor graph $G$, and that Nature chooses 
a product distribution. Explicitly, the Engineer's strategy can 
be assumed to take the form $p^*(x) = \prod_{a \in F} \phi_a(x_{\partial a})$, while 
Nature's strategy takes the form $q^*(\theta) = \prod_{a \in F} q_a(\theta_a)$. 

Proof: Note that the Engineer's payoff is given by 

$$\mathbb{E}_{(p,q)} O(x, \theta) = \sum_{a \in F} \sum_{x_{\partial a}, \theta_a} p_a(x_{\partial a}) \, q_a(\theta_a) \, \psi_a(x_{\partial a}, \theta_a).$$

Therefore, only the marginals $q_a(\theta_a)$ of Nature's distribution 
play a role in the payoff. Hence, we can assume that 
Nature has a product distribution $q(\theta) = \prod_{a \in F} q_a(\theta_a)$. 

Similarly, only the marginals $p_a(x_{\partial a})$ appear in the 
payoff. Thereby, without loss of generality we can assume that 
the Engineer's distribution is an MRF with respect to $G$. This 
follows from the fact that for all factor graphs $G$ and for all 
joint distributions $p(x)$, there exists a distribution $\tilde{p}(x)$ that is 
representable as an MRF with graph $G$ such that $\tilde{p}_a(x_{\partial a}) = p_a(x_{\partial a})$ [?]. ■ 
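The first half of the argument can be illustrated numerically. In the sketch below, which uses made-up potentials for two factors and a fixed assignment of the Engineer, a correlated joint distribution for Nature and the product of its marginals yield the same expected payoff. All names and numbers are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

# psi[a][theta_a]: payoff of factor a for a fixed Engineer assignment (made-up values).
psi = {"a": {0: 3, 1: -1}, "b": {0: 2, 1: 5}}
factors = ["a", "b"]

# A perfectly correlated joint distribution over (theta_a, theta_b).
q_joint = {(0, 0): Fr(1, 2), (1, 1): Fr(1, 2)}

def marginal(q, k):
    """Marginal of the k-th coordinate of a joint distribution q."""
    m = {}
    for theta, pr in q.items():
        m[theta[k]] = m.get(theta[k], 0) + pr
    return m

def expected_payoff(q):
    """Expected value of the sum of the factor payoffs under q."""
    return sum(pr * sum(psi[f][theta[k]] for k, f in enumerate(factors))
               for theta, pr in q.items())

# Product distribution with the same single-factor marginals.
margs = [marginal(q_joint, k) for k in range(len(factors))]
q_prod = {theta: margs[0][theta[0]] * margs[1][theta[1]]
          for theta in product([0, 1], repeat=2)}
```

Both distributions give an expected payoff of 9/2 even though they differ as joint distributions, since the payoff is a sum of terms that each involve a single factor.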

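Notice that, for a graph $G$ with bounded degree, an MRF can 
be specified by $O(|V|)$ parameters. In particular, the MRF 
is completely specified by the marginals $p_a(x_{\partial a})$. We then 
reformulate the minimax problem as the one of computing 
the minimax marginals $\{p_a^*\}_{a \in F}$. By definition, these belong 
to the so-called marginal polytope 

$$\mathrm{MARG}(G) = \Big\{ \{p_a\}_{a \in F} : p_a(x_{\partial a}) = \sum_{x_{V \setminus \partial a}} p(x) \ \text{for some distribution } p(x) \Big\}.$$

Problem (3) can therefore be restated as an LP over $\mathrm{MARG}(G)$: 

$$\begin{aligned}
\text{maximize} \quad & \sum_{a \in F} \min_{\theta_a} \sum_{x_{\partial a}} p_a(x_{\partial a}) \, \psi_a(x_{\partial a}; \theta_a) \\
\text{s.t.} \quad & \{p_a\}_{a \in F} \in \mathrm{MARG}(G).
\end{aligned} \tag{4}$$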



Here we used the fact that the min over $q$ in the simplex 
is necessarily achieved at one extremal point, i.e. at a pure 
strategy. In general, $\mathrm{MARG}(G)$ does not possess a polynomial 
separation oracle and therefore this problem is not tractable. 
Instead, we relax it to the set of locally consistent marginals 
on $G$, denoted by $\mathrm{LOC}(G)$: 

$$\mathrm{LOC}(G) = \Big\{ \{p_a\}_{a \in F} : \exists\, p_i(x_i) \ \text{such that} \ p_i(x_i) \ge 0, \ \sum_{x_i} p_i(x_i) = 1, \ p_a(x_{\partial a}) \ge 0, \ \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a}) = p_i(x_i) \Big\}. \tag{5}$$



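We then have the following relaxation of problem (4) to the 
local polytope. 

$(P_0)$: 

$$\begin{aligned}
\text{maximize} \quad & \sum_{a \in F} \min_{\theta_a} \sum_{x_{\partial a}} p_a(x_{\partial a}) \, \psi_a(x_{\partial a}; \theta_a) \\
\text{s.t.} \quad & \{p_a\}_{a \in F} \in \mathrm{LOC}(G).
\end{aligned} \tag{6}$$

If $G$ is a tree, then $\mathrm{LOC}(G) = \mathrm{MARG}(G)$ and therefore this 
relaxation is exact. 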
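Membership in the local polytope amounts to checking that every factor marginal sums to the node marginals of its neighbors. The sketch below measures the largest violation of this condition; the data layout (dictionaries keyed by factor and node names) and the example chain are assumptions made for illustration.

```python
def max_inconsistency(factor_marginals, node_marginals):
    """Largest gap between a summed-out factor marginal and a node marginal.

    factor_marginals: {a: (nodes, table)} where nodes is a tuple of node ids
        and table maps assignments (tuples aligned with nodes) to probabilities.
    node_marginals: {i: {x_i: probability}}.
    """
    worst = 0.0
    for nodes, table in factor_marginals.values():
        for pos, i in enumerate(nodes):
            # Sum the factor marginal over all variables except x_i.
            summed = {}
            for assignment, pr in table.items():
                summed[assignment[pos]] = summed.get(assignment[pos], 0.0) + pr
            for xi, pi in node_marginals[i].items():
                worst = max(worst, abs(summed.get(xi, 0.0) - pi))
    return worst

# Hypothetical chain 0 - a - 1 - b - 2 with binary variables.
uniform = {+1: 0.5, -1: 0.5}
nodes = {0: uniform, 1: uniform, 2: uniform}
consistent = {
    "a": ((0, 1), {(+1, +1): 0.5, (-1, -1): 0.5}),
    "b": ((1, 2), {(+1, -1): 0.5, (-1, +1): 0.5}),
}
inconsistent = dict(consistent, b=((1, 2), {(+1, -1): 1.0}))
```

A return value of zero for nonnegative, normalized marginals means the collection lies in LOC(G); on a tree such a collection is also realizable by a joint distribution.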

III. Algorithm 
Here, we first present the alternating direction method of 
multipliers (ADMM) [?] algorithm for solving convex 
optimization problems and state a general result regarding 
its convergence properties. Subsequently, we show how the 
optimization problem $(P_0)$ can be transformed to conform with 
the general form for the ADMM algorithm. We derive the Robust
Max-Product algorithm from the transformed variant 
of the problem $(P_0)$ and obtain convergence guarantees using 
the result stated for the general case of ADMM algorithms. 

A. ADMM Algorithm 

What follows is a short presentation of the ADMM algorithm
and its properties. The reader interested in a more 
comprehensive treatment can refer to [4]. Consider the optimization
problem 

$$\begin{aligned}
\text{minimize} \quad & f(x) + g(z) \\
\text{s.t.} \quad & Ax - z = 0,
\end{aligned} \tag{7}$$

where $A \in \mathbb{R}^{p \times n}$. The augmented Lagrangian for this problem 
is defined as 

$$L_\rho(x, z, y) = f(x) + g(z) + y^T (Ax - z) + \frac{\rho}{2} \|Ax - z\|_2^2 \tag{8}$$

with $\rho > 0$ a parameter and $\|\cdot\|_2$ indicating the $\ell_2$ norm. 
The ADMM algorithm tries to solve the above optimization 
problem by starting from some initial estimates $(z^{(0)}, y^{(0)} = 0)$
and performing the following iteration 

$$\begin{aligned}
x^{(t+1)} &= \arg\min_x L_\rho(x, z^{(t)}, y^{(t)}), \\
z^{(t+1)} &= \arg\min_z L_\rho(x^{(t+1)}, z, y^{(t)}), \\
y^{(t+1)} &= y^{(t)} + \rho \big(A x^{(t+1)} - z^{(t+1)}\big).
\end{aligned} \tag{9}$$

The update rules in (9) closely resemble the dual gradient 
descent method where the dual is obtained from the augmented 
Lagrangian. This indeed is the gist of the method of multipliers.
Despite the fact that the primal optimization is done in two 
steps and the augmented Lagrangian is used in place of the 
Lagrangian, the iteration (9) provably converges to the solution 
of (7). Formally, assume the optimization problem (7) has a 
finite optimum value $p^*$. We say the Lagrangian $L(x, z, y)$ has 
a saddle point $(x^*, z^*, y^*)$ if $L(x^*, z^*, y) \le L(x^*, z^*, y^*) \le L(x, z, y^*)$
for all $x$, $z$, and $y$. Then the following theorem 
holds. 



Theorem III.1. ([?] Theorem 3.1, [?] Theorem 8, [?] Section 
3.2) Assume that the extended real valued functions $f(x)$ and 
$g(z)$ are closed, proper, and convex and the un-augmented 
Lagrangian $L_0(x, z, y)$ has a saddle point. Then 

$$\lim_{t \to \infty} f(x^{(t)}) + g(z^{(t)}) = p^*, \qquad \lim_{t \to \infty} A x^{(t)} - z^{(t)} = 0, \qquad \lim_{t \to \infty} y^{(t)} = y^*. \tag{10}$$
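As a minimal illustration of iteration (9), the sketch below runs ADMM on a scalar toy problem chosen for this example only: $f(x) = \frac{1}{2}(x - a)^2$, $g$ the indicator of $z \ge 0$, and $A = 1$. Both minimizations have closed forms, so the three updates can be written directly. The function name and the default parameters are assumptions.

```python
def admm_scalar(a, rho=1.0, iters=100):
    """ADMM for: minimize 0.5*(x - a)**2 + I(z >= 0) subject to x - z = 0."""
    z, y = 0.0, 0.0
    x = 0.0
    for _ in range(iters):
        # x-update: minimize 0.5*(x-a)^2 + y*(x-z) + (rho/2)*(x-z)^2 (closed form).
        x = (a - y + rho * z) / (1.0 + rho)
        # z-update: minimize I(z>=0) + y*(x-z) + (rho/2)*(x-z)^2,
        # i.e. clip the unconstrained minimizer x + y/rho at zero.
        z = max(0.0, x + y / rho)
        # Dual update with step size rho.
        y = y + rho * (x - z)
    return x, z, y

x_pos, z_pos, y_pos = admm_scalar(3.0)    # constraint inactive at the optimum
x_neg, z_neg, y_neg = admm_scalar(-2.0)   # constraint active at the optimum
```

For $a = 3$ the iterates approach $x = z = 3$ with a zero multiplier, while for $a = -2$ they approach $x = z = 0$ and the multiplier approaches $-2$, consistent with the three limits in (10).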



B. Robust Max-Product Algorithm 

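Note that the epigraph form of the optimization problem 
$(P_0)$ is given by 

$$\begin{aligned}
\text{minimize} \quad & \sum_{a \in F} \lambda_a \\
\text{s.t.} \quad & \lambda_a + \sum_{x_{\partial a}} p_a(x_{\partial a}) \, \psi_a(x_{\partial a}; \theta_a) \ge 0, && \forall a, \theta_a, \\
& p_i(x_i) = \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a}), && \forall (i, a) \in E, \ x_i, \\
& p_a(x_{\partial a}) \ge 0, && \forall (i, a) \in E, \ x_{\partial a}, \\
& p_i(x_i) \ge 0, && \forall i \in V, \ x_i,
\end{aligned} \tag{11}$$

where the minimization is over $\{\lambda_a\}_{a \in F}$, $\{p_a\}_{a \in F}$, $\{p_i\}_{i \in V}$ 
and $\mathrm{LOC}(G)$ is represented in terms of the set of marginals. 
Define the indicator function $I(\cdot)$ as 

$$I(x) = \begin{cases} 0 & \text{if } x = \text{TRUE}, \\ \infty & \text{if } x = \text{FALSE}. \end{cases} \tag{12}$$

Furthermore, let $f(\{\lambda_a\}_{a \in F}, \{p_a\}_{a \in F}) = \sum_{a \in F} \tilde{f}(\lambda_a, p_a)$ 
and $g(\{p_i\}_{i \in V}) = \sum_{i \in V} \tilde{g}(p_i)$ whereby $\tilde{f}(\lambda_a, p_a)$ and $\tilde{g}(p_i)$ 
are defined as 

$$\tilde{f}(\lambda_a, p_a) = \lambda_a + \sum_{\theta_a} I\Big(\lambda_a + \sum_{x_{\partial a}} p_a(x_{\partial a}) \, \psi_a(x_{\partial a}, \theta_a) \ge 0\Big) + \sum_{x_{\partial a}} I\big(p_a(x_{\partial a}) \ge 0\big), \tag{13}$$

and 

$$\tilde{g}(p_i) = I\big(p_i \in S^{|\mathcal{X}| - 1}\big). \tag{14}$$

Here, $S^{|\mathcal{X}| - 1}$ is the $(|\mathcal{X}| - 1)$-dimensional simplex and $p_a$ and $p_i$ 
are $|\mathcal{X}|^{|\partial a|}$ and $|\mathcal{X}|$ dimensional real vectors respectively. 
It is easy to see that the extended real valued functions 

$$f : \mathbb{R}^{m + \sum_{a \in F} |\mathcal{X}|^{|\partial a|}} \to (-\infty, +\infty], \qquad g : \mathbb{R}^{|V| |\mathcal{X}|} \to (-\infty, +\infty]$$

are closed, convex, and proper. 

Using the functions $f$ and $g$, the optimization problem (11) 
can be restated as 

$(P)$: 

$$\begin{aligned}
\text{minimize} \quad & f(\{\lambda_a\}_{a \in F}, \{p_a\}_{a \in F}) + g(\{p_i\}_{i \in V}) \\
\text{s.t.} \quad & p_i(x_i) = \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a \setminus i}; x_i), \quad \forall (a, i) \in E, \ \forall x_i \in \mathcal{X},
\end{aligned}$$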

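TABLE I 

Robust Max-Product Algorithm 

Robust Max-Product: 

Input: Factor graph $G = (V, F, E)$, potential functions $\{\psi_a(x_{\partial a}, \theta_a)\}_{a \in F}$ 
Output: Local marginals $\{p_a(x_{\partial a})\}_{a \in F}$ 

1. Initialize: 

$u^{(0)}_{ai}(x_i) = 0$, $\forall (a, i) \in E$, $x_i \in \mathcal{X}$ 

2. Update until convergence: 
At the factor nodes: 

$$\big(p_a^{(t+1)}, \lambda_a^{(t+1)}\big) = \arg\min \ \lambda_a + \frac{\rho}{2} \sum_{i \in \partial a} \sum_{x_i \in \mathcal{X}} \Big( \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a \setminus i}; x_i) - p_i^{(t)}(x_i) - u_{ai}^{(t)}(x_i) \Big)^2$$

$$\text{s.t.} \quad \lambda_a + \sum_{x_{\partial a}} p_a(x_{\partial a}) \, \psi_a(x_{\partial a}; \theta_a) \ge 0, \ \forall \theta_a, \qquad p_a(x_{\partial a}) \ge 0, \ \forall x_{\partial a} \in \mathcal{X}^{\partial a}$$

At the variable nodes: 

3. Return: $\{p_a\}_{a \in F}$. 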



where the minimization is over $\{\lambda_a\}_{a \in F}$, $\{p_a\}_{a \in F}$, and 
$\{p_i\}_{i \in V}$. Optimization problem $(P)$ follows the form of the 
general problem in Eq. (7) and can be solved using the ADMM 
algorithm. The augmented Lagrangian for problem $(P)$ can be 
written as 

Lp ({X a }aeF, {Pa}a£F, {Pi}i£V, {u a i} ( a ,i)eE) = 



aeF 



a\i 'i x i ) d5) 



(a,i)GE,XieX 



+ E P[Pi( X i)- E Po,(XQa\i\ x i) 
(a,i)£E,Xi£X %da\i 

where $\rho$ is a parameter and $u_{ai}(x_i)$ are the dual variables. 
Notice that the special form of the constraint results in the 
quadratic penalty being block separable in $\{p_a\}_{a \in F}$. Furthermore,
the function $f$ is also block separable in $\{p_a\}_{a \in F}$. 
Similarly, the quadratic penalty and the function $g$ 
are separable in $\{p_i\}_{i \in V}$. These facts enable us to further 
decompose the first two steps of the ADMM iteration (Eq. 
(9)) and perform the optimization locally at the corresponding factor 
and variable nodes. The resulting algorithm is presented 
in Table I. 
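Since $g$ is the indicator of the simplex and the penalty is quadratic, one natural reading of the local step at a variable node is a Euclidean projection onto the probability simplex. The sketch below implements that projection with the standard sort-and-threshold procedure; it is offered under this assumption and is not taken from Table I.

```python
def project_simplex(v):
    """Euclidean projection of a real vector v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    shift = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        # Keep the shift of the largest k whose k-th sorted entry stays positive.
        if u_k - candidate > 0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

already_inside = project_simplex([0.5, 0.5])
clipped = project_simplex([2.0, 0.0])
lifted = project_simplex([0.2, 0.2])
```

The projection leaves points of the simplex unchanged, and otherwise subtracts a common shift and clips negative entries at zero so that the result is nonnegative and sums to one.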

C. Convergence of the Robust Max-Product Algorithm 

For $(a, i) \in E$, $x_i \in \mathcal{X}$ and local marginals $\{p_a\}_{a \in F}$, 
$\{p_i\}_{i \in V}$, define the marginal inconsistency residual $r_{ai}(x_i)$ 
as 

$$r_{ai}(x_i) = p_i(x_i) - \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a \setminus i}; x_i). \tag{16}$$

Let $r \in \mathbb{R}^{|E||\mathcal{X}|}$ be the vector of residuals defined as 
$r = (\{r_{ai}(x_i)\}_{(a,i) \in E, x_i \in \mathcal{X}})$. In particular, let $r^{(t)}$ be the 
marginal inconsistency residual at iteration $t$ of the Robust 
Max-Product algorithm. 

Define $C^{(t)} = \sum_{a \in F} \lambda_a^{(t)}$, the cost function at iteration $t$ 
of the Robust Max-Product algorithm, and let $C^*$ be the 
optimum value of the optimization problem (11). Figures 3 and 
4 show $C^{(t)}$ and $\frac{1}{|E||\mathcal{X}|} \|r^{(t)}\|_1$ as a function of $t$ (iteration) for 
the Ising model described in Section II with $\Delta = 1$. Figure 4 
shows that the marginals quickly become nearly consistent 
while Figure 3 demonstrates that the objective value converges 
to the optimum value within a modest number of iterations. 

Fig. 3. Engineer's objective vs. iteration. 

Fig. 4. Average marginal inconsistency vs. iteration. 

The following theorem provides theoretical guarantees for 
the behavior observed in Figures 3 and 4. In particular, it states 
that the sequence of local marginals in the Robust Max-Product
algorithm converges to a set of locally consistent 
marginals that achieves the optimum payoff for the Engineer. 

Theorem III.2. For any graph $G = (V, F, E)$ and set of potential 
functions $\{\psi_a(x_{\partial a})\}_{a \in F}$ the following hold. 

(i) $\lim_{t \to \infty} r^{(t)} = 0$. 

(ii) $\lim_{t \to \infty} C^{(t)} = C^*$. 

The proof of this theorem can be obtained by applying the 
result of the following lemma to Theorem III.1. 

Lemma III.1. For any graph $G = (V, F, E)$ and potential 
functions $\{\psi_a(x_{\partial a})\}_{a \in F}$, the optimization problem (6) is feasible.
Furthermore, there exist $\{\lambda_a^*\}_{a \in F}$, $\{p_a^*\}_{a \in F}$, $\{p_i^*\}_{i \in V}$, 
and $\{u_{ai}^*\}_{(a,i) \in E}$ such that $(\{\lambda_a^*\}_{a \in F}, \{p_a^*\}_{a \in F}, \{p_i^*\}_{i \in V}, \{u_{ai}^*\}_{(a,i) \in E})$
is a saddle point of the augmented Lagrangian 
(15) with $\rho = 0$ and $\sum_{a \in F} \lambda_a^* = C^*$. 

Proof: Consider the optimization problem in Eq. (11). 
First note that, given that the potential functions $\{\psi_a(x_{\partial a})\}_{a \in F}$ 
are bounded, this problem has a strictly feasible point. Also, 
it is a linear program. Hence strong duality holds by 
Slater's theorem [5] and the Lagrangian has a saddle point. 
Let $\{\lambda_a^*\}_{a \in F}$, $\{p_a^*\}_{a \in F}$, $\{p_i^*\}_{i \in V}$ be the values of the primal 
variables at this saddle point. Similarly, let $\{u_{ai}^*\}_{(a,i) \in E}$ be 
the values of the dual variables corresponding to the constraints
$p_i(x_i) = \sum_{x_{\partial a \setminus i}} p_a(x_{\partial a})$ at this saddle point. Then 
$\{\lambda_a^*\}_{a \in F}$, $\{p_a^*\}_{a \in F}$, $\{p_i^*\}_{i \in V}$ are primal optimal for (11). In 
particular, $\sum_{a \in F} \lambda_a^* = C^*$. Furthermore, it is easy to see that 
$(\{\lambda_a^*\}_{a \in F}, \{p_a^*\}_{a \in F}, \{p_i^*\}_{i \in V}, \{u_{ai}^*\}_{(a,i) \in E})$ is a saddle 
point of the Lagrangian in Eq. (15) with $\rho = 0$. ■ 

IV. Related work 
Several groups have investigated the impact of graphical model 
structure on the computational properties of Nash equilibria 
[?]. In particular, Ortiz and Kearns [18] 
proposed a message passing algorithm (called NashProp) to 
find Nash equilibria. However, as shown in [9], the problem of 
computing Nash equilibria is PPAD-complete even on trees. 
Within the graphical games studied in this literature, a different 
player controls each vertex of a graph, and a game is played 
along each edge. A single player has at her disposal only 
a small number of pure strategies (typically two), and the 
problem complexity arises because of the large number of 
players. 

Let us emphasize that the present paper studies a very 
different class of models. We consider a small fixed number 
of players (indeed in this paper only two players), but each 
of them has at her disposal a large number of pure strategies. 
The problem complexity is due to the proliferation of strategies. 

The motivation for focusing on two-player zero-sum games 
came from their relevance to optimization and inference under 
model uncertainty. A few authors [13], [20] have already analyzed
the sensitivity of message passing algorithms to model 
uncertainty. However, these studies assumed a probability 
distribution over model parameters, which is very restrictive 
in a high-dimensional setting, or carried out a perturbation 
analysis, without constructing more robust algorithms. 

ADMM and many related algorithms (Uzawa's algorithm, 
Douglas-Rachford splitting, proximal method, Bregman iterative
methods, etc.) have been around for a few decades. 
However, recent years have seen a surge of interest in these 
algorithms in many fields. The reader can refer to [4] for many 
examples in the field of statistical learning. Closer to the spirit 
of this paper, [19] uses the technique of Bregman projection 
to obtain fractional solutions for the maximum a posteriori 
probability (MAP) problem in graphical models. The problem 
addressed in this paper is fundamentally different from this 
work in that we consider the case of adversarial uncertainty 
in the model. 